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ABSTRACT 

We introduce Deep Neural Programs (DNP), a novel pro¬ 
gramming paradigm for writing adaptive controllers for cy¬ 
ber-physical systems (CPS). DNP replace if and while state¬ 
ments, whose discontinuity is responsible for undecidability 
in CPS analysis, intractability in CPS design, and frailness 
in CPS implementation, with their smooth, neural nif and 
nwhile counterparts. This not only makes CPS analysis de¬ 
cidable and CPS design tractable, but also allows to write 
robust and adaptive CPS code. In DNP the connection be¬ 
tween the sigmoidal guards of the nif and nwhile statements 
has to be given as a Gaussian Bayesian network, which re¬ 
flects the partial knowledge, the CPS program has about its 
environment. To the best of our knowledge, DNP are the 
first approach linking neural networks to programs, in a way 
that makes explicit the meaning of the network. In order to 
prove and validate the usefulness of DNP, we use them to 
write and learn an adaptive CPS controller for the parallel 
parking of the Pioneer rovers available in our CPS lab. 
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1. INTRODUCTION 

Recent advances in sensing, actuation, communication, and 
computation technologies, as well as innovations in their 
integration within increasingly smaller, interconnected de¬ 
vices, has lead to the emergence of a new and fascinating 
class of systems, the so-called cyber-physical systems (CPS). 
Examples of CPS include smart grids, smart factories, smart 
transportation, and smart health-care [2j. 

Similarly to living organisms, CPS operate in an uncertain, 
continuously evolving ecosystem, where they compete for 
a limited supply of resources. For survival, CPS need to 
continuously adapt, such that, they react in real time and 
optimal fashion, with regard to an overall survival metric, 


their partial knowledge, and their bounded sensing, actua¬ 
tion, communication and computation capabilities. 

In order to equip CPS with such exceptional features, vari¬ 
ous researchers have started to wonder weather our current 
CPS analysis, design and implementation techniques are still 
adequate. Going back to Parnas, Chaudhuri and Lezama 
identified in a series of intriguing papers [6|7|25] , the if-then- 
else construct as the main culprit for program frailness. In 
a simple decision of the form if (x > a), the predicate x > a 
acts like a step function (see Figure l|, with infinite plateaus 
to the left and to the right of the discontinuity point x = a. 
In a typical mid-size program, the nesting of thousands of 
if-then-else conditions leads to a highly nonlinear program, 
consisting of a large number of plateaus separated by dis¬ 
continuous jumps. This has important implications. 

From a CPS-analysis point of view, predicates of the form 
f(x)>a, where f(x) is a nonlinear analytic function, are 
a disaster. They render CPS analysis undecidable. Intu¬ 
itively, in order to separate all points on one side of the curve 
f(x) =a, from all on the other side, one needs to forever de¬ 
crease the size of a grid, in all the rectangles that are crossed 
by the curve. Such a process does never terminate, except 
for linear functions where computation is still prohibitive. 
For this reason, a series of papers, of Fraenzle, Ratschan, 
Wang, Gao and Clarke [11||12[|27||32| , proposed the use of 
an indifference region <5 (see Figure fit), and rewrite the predi¬ 
cates in the form f(x)—a > 5. This approach not only makes 
program analysis (wrt. reals) decidable, and computable in 
polynomial time, but it also aligns it with the finite compu¬ 
tational precision available in today’s computers. 

From a CPS-design point of view, where one is interested to 
find the values of a for which an optimization criterion is sat¬ 
isfied, predicates of the form f(x) > a are a nightmare. They 
render CPS optimization intractable. Intuitively, a gradient- 
descent method searching for a local minimum, gets stuck 
in plateaus, where a small perturbation to the left or to the 
right, still keeps the search on the same plateau. In order 
to alleviate this problem, Chaudhuri and Lezama [6] pro¬ 
posed to smoothen the steps, by passing a Gaussian input 
distribution through the CPS. This can be thought of as cor¬ 
responding to the sensing and actuation noise. The parame¬ 
ters of this distribution control the position of the resulting 
sigmoidal curve (see Figure [I]), and its steepness, that is, 
the width of the above indifference region 5. The authors 
however, stopped short of proposing a new programming 






paradigm, and the step-like functions in the programs to be 
optimized, posed considerable challenges in the analysis, as 
they cut the Gaussians in very undesirable ways. 



Figure 1: Sigmoid (blue) and step (black) functions. 


From a CPS-implementation point of view, conditional state¬ 
ments of the form if (f (x) > a) are also a disaster. They 
render CPS frail and nonadaptive. In other words, a small 
change in the environment or the program itself, may lead to 
catastrophic consequences, as the CPS is not able to adapt. 
In the AI community, where steps are called hard neurons 
and sigmoid curves are called soft neurons , adaptation and 
robustness is achieved by learning a particular form of Ba¬ 
yesian networks with soft-neuron distributions, called neu¬ 
ral networks. Such networks, and in particular deep neural 
networks, have recently achieved amazing performance, for 
example in the recognition of sophisticated patterns [8 10 


This technology looks so promising that major companies 
such as Google and Amazon are actively recruiting experts 
in this area. However, the neural-networks learned are still 
met with considerable opposition, as it is very difficult, if 
not impossible, to humanly understand them. 


Having identified the if-then-else programming construct as 
the major source of trouble in the analysis, design and im¬ 
plementation of CPS, the following important question still 
remains: Is there a simple, humanly understandable way to 
develop robust and adaptive CPS? It is our belief, that such 
a way not only exists, but it is also amazingly simple! 


may possess a quad-tree Bayesian structure, hierarchically 
reflecting the neighbourhood relation among subimages. The 
depth of the network is determined by the height of the tree. 
Similarly, a purely sequential program, representing succes¬ 
sive decisions, will have a very linear Bayesian structure, 
whose depth is determined by the number of decisions. 

In order to validate our new paradigm, we use the parking 
example from [5] . The goal of this example was to automati¬ 
cally learn the parameters of a program skeleton, intuitively 
expressing the control as follows: Go backwards up to a 
point ai, turn up to an angle bi, go backwards up to a 2 , turn 
again up to 62 and finally go backwards up to 03 . Since this 
program uses classical if statements, it is not adaptive, and 
a small perturbation such as a slippery environment, may 
lead to an accident. We therefore rewrite the program with 
nif statements, and learn the conditional Gaussian network 
associated with the predicates within these statements. Us¬ 
ing its sensors, the control program is now able to detect the 
actual stopping or turning points, and to adequately sample 
its next targets. Although this program is written once and 
for all, it is able to adapt to a varying environment. 

The main contributions of the work presented in this paper 
can be therefore briefly summarized as follows: 

1. We propose a new programming paradigm for the de¬ 
velopment of smooth and adaptive CPS in which: 

• Troublesome ifs are replaced by neural ifs, thus 
improving analysis, design and implementation, 

• Partial knowledge is encoded within a learned Ba¬ 
yesian network, with Gaussian distributions. 

2. We demonstrate the versatility of this programming 
paradigm on a parking example using Pioneer rovers. 
The associated youtube videos are available at [2]. 

Given obvious space limitations, we do not address CPS- 
analysis and CPS-design (optimization) in this paper. They 
will be the subject of a series of follow up papers. 


First, as program skeletons express domain knowledge and 
developer intuition, they are here to stay. However, one 
needs to replace hard neurons with their soft counterparts. 
We call such program statements neural if-then-else state¬ 
ments, or nif-then-else for short. They represent probabilis¬ 
tic, probit distributions, and the decision to choose the left 
or the right branch is sampled from their associated Gaus¬ 
sian distributions. As a consequence, a program with nif 
statements represents not only one, but a very large (up to 
the computational precision) set of correct executions. 

Second, the partial knowledge of such a program is encoded 
as a Bayesian network, expressing the conditional dependen¬ 
cies among the Gaussian distributions occurring within the 
nif statements. These dependencies may be given, learned 
through a preliminary phase and continuously improved dur¬ 
ing deployment, or inferred through optimization techniques. 
In this case, learning and optimization are considerably sim¬ 
plified, as the program is by definition smooth. The depth 
and the branching structure of these Bayesian networks re¬ 
flect the sequential and parallel nesting within the program, 
which is an essential asset in program understanding. 


The rest of the paper is organized as follows. In Section[2]we 
introduce Bayesian inference, Bayesian networks, and Gaus¬ 
sian and Probit distributions. In Section [3] we introduce our 
programming paradigm. In Section[4]we discuss how to learn 
the Gaussian Bayesian network. In Section [5] we discuss our 
implementation platform and the associated results. In Sec¬ 
tion [ 6 ] we discuss related work. Finally in Section [7] we give 
our concluding remarks and directions for future work. 

2. BACKGROUND 

The main tool for logical inference is the Modus-Ponens rule: 
Assuming that proposition A is true, and that from the truth 
of A one can infer the truth of proposition B, one can con¬ 
clude that propositions A and B are both true. Formally: 

AA(A^B) = A A B = BA(B^A) 

In probability theory, the uncertainty in the truth of a propo¬ 
sition (also called an event) is expressed as a probability, 
and implication between propositions is expressed as a con¬ 
ditional probability. This leads to a probabilistic extension 
of Modus-Ponens, known as the Bayes’ rule. Formally: 


For example, a parallel algorithm for pattern recognition, 


P(A) P(B | A) = P(A A B) = P(B) P(A \ B) 













This rule, consistent with logic, is the main mechanism for 
probabilistic inference [28]. It allows to reason in both for¬ 
ward, or causal way, and backwards, or diagnostic way. For 
example if B is causally implied by A, then the left term in 
the above equation denotes a causal relation, and the right 
term, a diagnostic relation. Equating the two, allows one to 
use causal information (or observed events), for diagnostic 
inference. In real-world systems, causal relations are usually 
chained and can form sophisticated structures. 

Bayesian Networks. A probabilistic system is completely 
characterized by the joint probability distribution of all of its 
(possibly noisy) components. However, the size of this dis¬ 
tribution typically explodes, and its use becomes intractable. 
In such cases, the Bayes’ rule, allows to successively decom¬ 
pose the joint distribution according to the conditional de¬ 
pendences among its random variables (RV). These are both 
discrete or continuous variables, which associate to each 
value (or infinitesimal interval) in their range, the rate of its 
occurrence. Networks of conditional dependencies among 
random variables are known as Bayesian networks (BN), 
and they have a very succinct representation. 

Syntactically, a BN is a direct acyclic graph G = (V,E), 
where each vertex Vi £ V represents a random variable X, 
and each edge dj € E represents a conditional dependence 
of the variable Xj on the variable X,. To avoid the com¬ 
plications induced by the use of the joint probability dis¬ 
tribution (or density), each variable Xi is associated with 
a conditional probability distribution (CPD) that takes into 
account dependencies only between the variable and its di¬ 
rect parents [20||28| . Such a compact representation keeps 
information about the system in a distributed manner and 
makes reasoning tractable even for large number of variables. 

Although in many interesting applications the variables of a 
BN have discrete distributions (e.g. in fault detection, a de¬ 
vice might have only a finite number of diagnosable errors, 
caused by a finite set of faults), in many other applications, 
continuous random variables naturally describe the entities 
of interest. For instance, in our parallel parking running 
example, a Pioneer rover starting from an initial location, 
needs to execute a sequence of motion primitives (e.g. driv¬ 
ing or turning forward or backward with fixed speed for a 
particular distance Xi or angle Xj), which will result in 
parking the rover in a dedicated parking spot. 

Gaussian Distributions. Any real measurement of a phys¬ 
ical quantity is affected by noise. Hence, the distances and 
the angles occurring in our parking example are naturally 
expressed as continuous RVs. In this paper we assume that 
variables have Normal, also called, Gaussian distributions 
(GD). These distributions naturally occur from the mixing 
of several (possibly unobservable) RVs, and they have math¬ 
ematical properties making them very attractive. 

An univariate Gaussian distribution (UGD) is denoted by 
N(p,a 2 ) and it is characterized by two parameters: The 
mean p and the variance a 2 . In our example, the desired 
distance in the first motion is associated with p, which is 
perturbed by noise with variance a 2 . The Gaussian proba¬ 


bility density of a RV X with values x is defined as follows: 

(1) 

Parallel parking includes a sequence of motion primitives 
that are mutually dependent. To express these dependencies 
we use a multivariate Gaussian Distribution (MGD) [15] , 
which generalizes the Gaussian distribution to multiple di¬ 
mensions. For a n-dimensional vector of random variables 
X the probability density function is characterized by a n- 
dimensional mean vector p and a symmetric positive definite 
covariance matrix S. To express the probability density of 
a multivariate Gaussian distribution we use the inverse of 
covariance matrix, called precision matrix T = , which 

will be helpful later during the learning phase. The proba¬ 
bility density then can be written as follows [24] : 

pdf - 2(x) = ( 2 (2) 
where A 2 (x) = (x — p) T T(x — p). 

A Gaussian BN (GBN) is a BN where random variables X 
associated to each node in the network have associated a 
Gaussian distribution, conditional on their parents Xi. 

Probit Distributions. In order to smoothen the decisions 
in a program, we need to choose a function without plateaus 
and discontinuities. Since we operate with Gaussian random 
variables, the natural candidate is their cumulative distri¬ 
bution function (CDF). This is an 5-shaped function or a 
sigmoid (see Figure [2b, whose steepness is defined by a 2 , 
where erf denotes the error function: 

cdf M , CT2 (x) = ^l + erf (3) 

For a particular value x of X, the function cdf 2 (x) returns 
the probability that a random sample from the distribution 
N(p,a 2 ) will belong to the interval (—oo,x]. 

Since the sensors and actuators of the Pioneer rover are 
noisy, the trajectories it follows are each time different from 
the optimal one (assuming that such difference is tolerated 
by the parking configuration), even if the optimal trajec¬ 
tory of the parking example is known. To be adaptive we 
use a combined approach: we incorporate probabilistic con¬ 
trol structures in the program (introduced in the Section [3| 
and sample commands from a GBN, whose parameters were 
learned experimentally. To detect changes in the environ¬ 
ment and get more accurate position estimates, data from 
various sensors are combined with a sensor fusion algorithm. 

3. NEURAL CODE 

Traditional inequality relations (e.g. >, >, <, <) define a 
sharp (or firm) boundary on the satisfaction of a condition, 
which represents a step function (see Fig [I]). Using firm 
decisions in a program operating on Normal RVs, cuts dis¬ 
tributions in halves, leaving unnormalized and invalid PDFs 
(see Figure[3] The upper right plot shows the approximation 
of the PDF after passing a Normal RV through a traditional 
conditional statement). Hence, to keep a proper PDF after 
passing a Normal RV through an if or a loop statement one 
needs to perform a re-normalization of the PDF. 







Iii order to avoid re-shaping of probability density each time 
after a variable is passed through a condition, we introduce 
a special type of control structure called neural if, or nif 
for short. The name is coined to express the key novelty 
of our approach: We propose to use a smooth conditionals 
cdf M CT 2 ( 2 ;) instead of firm ones (see Figure [l). A nif state¬ 
ment operates on an inequality relation and a variance a 2 , 
and decides which branch should be taken: nif(x # y,cr 2 ), 
where if can be replaced with (>, >, < or <) and <r 2 rep¬ 
resents the uncertainty of making a decision. For the case 
when <t 2 —> 0 (no uncertainty) we require the nif statement 
to behave as a traditional if statement. 

The evaluation of the nif () statement is explained on hand 
of the following example, where x, a 6 1Z and <r 2 £ 1Z + . 


interval is (— 00 ; + 00 ) and includes every possible sample; 
for the second case the probability of taking SI is 0, hence 
the interval is empty and cannot contain any sample. An if 
statement is an nif statement without uncertainty. 

Let us illustrate the concept on a concrete example. Suppose 
that in the current execution x = 0 and a = 1. Figure [2] 
illustrates how decisions are made if a 2 is: 0 . 4 2 , 7 r, 4 2 . Since 
diff(x,a) = 1, the probability of executing SI is defined by 
cdf 0 o . 2 (l) and for the above cases is equal to 0.994, 0.714 
and 0.599 respectively. The intervals for the above cases are 
as follows: [-1.095; 1.095], [-1.890; 1.890] and [-3.357; 3.357]. 
In the second step we sample from the distributions with 
the corresponding <r 2 (A/”( 0 ,0.4 2 ), JV(0, n) and JV(0, 4 2 )) and 
check whether the value lies within the intervals. 


nif ( x >= a, cr 2 ) SI else S2 


The evaluation is done in two steps: (i) Find an 1Z inter¬ 
val I representing the confidence of making the decision; 
(ii) Check if a sample from the GD A/”(0, a 2 ) belongs to I. 


Since the input RV has a GD, and a GD is used to evaluate 
the condition, the result is a product of two GDs, which is 
also a GD scaled by some constant factor k. To find / in 
(i), we estimate the difference diff (x,a) between x and a. 
For the general case nif (x if a,< 7 2 ), with arbitrary ff, the 
difference diff (x, a) is defined as below, where e represents 
the smallest real number on a computer. 


diff(x,a) 



a 

a 


a — e 

if# is 

> 

a 

if# is 

> 
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if# is 

< 

X 

if# is 
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Informally, our confidence is characterized by the difference: 
The larger diff (x, a) is, the larger is the probability of 
executing SI. The probability to execute SI is given by 
cdf 0 CT 2 (diff (x,a)) and is used to obtain the interval [gi; 52 ] 
by calculating two symmetric quantiles qi and 52 such that: 


ri 2 

/ pdf Oi 0 2 (x)dx = cdf O 02 (diff (x, a)). 
J gl 


(4) 


In the second step a random sample from the distribution 
JV(0,a 2 ) is checked to belong to the interval [qi',q 2 ]- If it 
is within the interval, SI is executed, otherwise S2. At this 
point the probability value to execute SI is influenced by 
the variance a 2 (see Figure^. Hence, the dependence is 
twofold: diff (x,a) shows how confident we are in making 
a decision, and a 2 characterizes the uncertainty. 


For the case <r 2 —> 0 the nif statement is equivalent to the 
if statement. For infinitesimal a 2 the PDF is expressed as 
a Dirac function <5(a:), which has the following properties: 

(i) 5(x) = +00 if x — 0 else 0 

(“) f- ao 5 ( x )dx=l. 

The Dirac function essentially concentrates all the PD in a 
single point o: = 0. In this case the cdf 0 02 _> o (a;) becomes a 
step function (see Figure [I]). We consider two cases, as fol¬ 
lows: (i) diff (x,a) >0 and (ii) diff (x,a) <0. In the first 
case the probability of executing SI is equal to 1 , hence the 



Figure 2: CDFs, PDFs and the quantiles for x = 1 

So far we were concerned with single samples x ~ A/”(/r, <r 2 ). 
Figure [3] illustrates what happens to the distributions: The 
differences of passing a GD RV x ~ A/”(0, 0.1) through the 
statements if (x >= 0.15) and nif (x >= 0.15, 0.1). Us¬ 
ing our approach the GD is not cut in undesirable ways, and 
it maintains its GD form after passing the nif statement. 



Figure 3: Passing RVs through conditions 

We can introduce now the confidence-uncertainty trade-off 
to loops. The neural while statement, or nwhile for short, is 
an extension of a traditional while statement which incor¬ 
porates uncertainty. The statement nwhile( x ff a, <j 2 ){Pi} 
takes an inequality relation and variance <r 2 and executes 
the program Pi according to the following rule: (1) Com¬ 
pute diff (x ff a) and obtain quantiles qi and q 2 according to 
the Equation[4j (2) Check if a random sample x ~ _/V(0, a 2 ) 
is within the interval [qi ; 52 ]; (3) If the sample belongs to 
the interval, execute Pi and go to step (1), else exit. 

Since the nif and nwhile statements subsume the behavior 



































of traditional if and while statements (the case a 2 —> 0), we 
use them to define an imperative language with probabilistic 
control structures. Binary operators bop (such as addition 
multiplication), unary operators uop (negation), and con¬ 
stants c are used to form expressions E. A program P is a 
statement S or combination of statements. 

E ::= Xj | c | bop(Ei,E 2 ) | uop(Ei) 

S ::= skip | x; := E | Si; S 2 | nif(x; fi=c,a 2 ) Si else S 2 
nwhile( Xi # c, <r 2 ){ Si } 

In order to define the denotational semantics for the nif and 

the nwhile statements, we use check(xi, e, o 2 , ff), a function 
which: (1) Computes the difference diff(xi,c), (2) Finds 
quantiles qi and q 2 (Equation[4|, and (3) Checks if a sample 
x ~ A/”(0, o 2 ) belongs to the interval [qi; q 2 \. If it does, it re¬ 
turns value 1, otherwise it returns value 0. The denotational 
semantics of neural programs is then defined as follows: 

[skip] (x) = x 

[x» := E ](x) = x[[£](x) Xl ] 

[Si;S 2 ](x)= [S 2 ]([Si](x)) 

[nif (xi#c,cr 2 ) Si else S 2 ](x) = 
[check(xi,a,a 2 ,#)](x)[Si](x) + 

[-icheck(xi, a, cr 2 , #)](x)[S 2 ](x) 

[nwhile(xi # c, <r 2 ){ Si }](x) = 
x[->check(xi, a, <r 2 ,#)](x) + 

[check(xi,a,cr 2 ,#)](a:)[nwhile(xi #c, a 2 ){ Si }]([Si]x) 

We are now ready to write the control-program skeleton for 
the parallel parking task of our Pioneer rover, as a sequence 
of nwhile statements, as shown in Listing [T] Each nwhile 
corresponds to executing one motion primitive of the infor¬ 
mal description in Section [Tj The functions moving () and 
getPoseO are output and input statements, which for sim¬ 
plicity, were omitted from the denotational semantics. 

Listing 1: Parallel parking program skeleton 

nwhile ( currentDi st ance < targetLocat ionl , sigmalM 
moving () ; 

currentDistance = getPose(); 

> 

updateTargetLocations () ; 

nwhile ( current Angle < targetLocat ion2 , sigma2)-C 
turning (); 

currentAngle = getAngle () ; 

> 

updateTargetLocations () ; 

nwhile ( currentDi st ance < targetLocat ion3 , sigma3)-C 
moving () ; 

currentDistance = getPose(); 

> 

updateTargetLocations () ; 

nwhile ( current Angle < targetLocat ion4 , sigma4)-C 
turning (); 

currentAngle = getAngle () ; 

> 

updateTargetLocations () ; 

nwhile ( currentDistance < targetLocat ion5 , sigma5)-C 
moving () ; 

currentDistance = getPose(); 

> 


The versatility of this approach is that the program skeleton 
is written only once and comprises infinite number of con¬ 
trollers. The question we need to answer next is: 

What are the distances and turning angles for each action 
and how uncertain are we about each of them? 

To find the unknown parameters from Listing [l] namely the 
target locations targetLocations and variances sigmas, we 
use the learning procedure described in Section [4] 

4. BAYESIAN-NETWORK LEARNING 

Parking can be seen as a sequence of moves and turns, where 
each action depends on the previous one. For example, the 
turning angle typically depends on the previously driven dis¬ 
tance. Due to sensor noise and imprecision, inertia and 
friction forces, and also many possible ways to perform a 
parking task starting from one initial location, we assume 
that the dependence between actions is probabilistic, and in 
particular, the RVs are distributed according to Gaussian 
distributions (GD). We represent the dependencies between 
actions as the GBN in Figure [4j where h or ay- denotes a 
distance or a turning angle of the corresponding action and 
bij is a conditional dependence between consecutive actions. 



Figure 4: Gaussian Bayesian Network for parking 

In order to learn the conditional probability distributions of 
the GBN in Figure [4] and to fill in the targetLocations 
and the sigmas in Listing [l] we record trajectories of the 
successful parkings done by a human expert. Figure [5] shows 
example trajectories used during the learning phase. 



Figure 5: Example trajectories for the parking task 

We than use the fact that any GBN can be converted to an 
MGD [24] in our learning routine. Learning the parameters 
of the GBN can be divided into three steps: 

1. Convert the GBN to the corresponding MGD, 

2. Update the precision matrix T of the MGD, 

3. Extract cr 2 s and conditional dependences from T. 

1. Conversion step. To construct MGD we need to obtain 
the mean vector p and the precision matrix T. The mean 
vector p comprises the means of all the variables from the 
GBN. To find the symbolic form of the precision matrix, we 


















use the recursive notation in [16], where the value of the 

coefficients bi, will be learned in the update step below. The size of the training set v is updated to its new value: 


Ti+i 


T + 


i + l D i + l 

'f+1 

'i + 1 



(5) 


In order to apply Equation [5] we dehne an ordering starting 
with the initial node l \. Its precision matrix is equal to: 



The vector bi comprises dependence coefficients for node i 
on all its immediate parents it in the ordering. For example, 
the dependence vector for node 02 in the Figure [4] equals to: 

(° 

64 = 0 

\b43 

After applying the Equation[5]to each node in the GBN, we 
obtain the precision matrix T 7 , shown in Figure [~i~2{ Since 
each action in the parking task depends only on the previous 
one (for example, in Figure [4] the turning angle depends on 
the previously driven distance only), we can generalize the 
precision matrix for the arbitrary number of moves. For a 
GBN with k moves, all non-zero elements of the precision 
matrix T £ lZ k,k can be found according to the Equation]^] 
where T(r, c) is a c-th element in a r-th row of the precision 
matrix with indices started from one. 


T(i,i-l) = -%^, 


T(i,i)-4 + -^ i , 


T(i, i + 1) = — 


i+1 

9 » 


( 6 ) 


v* = v + M (9) 

The updated covariance matrix /3* combines the prior ma¬ 
trix /3 with the covariance matrix of the training set s: 


=£ 


X^ — X 


P* =P + a + 


v + M 


) 




( 10 ) 


Finally, the new value of the matrix ft is used to calculate 
the covariance matrix (T*)“ , where a* = a + M. 


(TT 1 


v* + 1 

v*(a* — n + 1 ) 


P* 


( 11 ) 


3. Extraction step. The new parameters of the GBN can 
now be retrieved from the updated mean vector p* and from 
(T*) _ . If new traces are available at hand, one can up¬ 
date the distributions by recomputing p* and (T*) _1 using 
Equations HES We depict the whole process in Figure [6j 
Unknown parameters from the program skeleton are learned 
from successful traces and these dependencies are used dur¬ 
ing the execution phase to sample the commands. 



2. Update step. Once we derived the symbolic form of the 
precision matrix (T 7 in our example), we use the training 
set, in order to learn the actual values of its parameters, as 
described in the algorithm from [24]. Each training example 
corresponds to a vector of lengths and turning angles 
for a successful parking task. The total number of examples 
in the training set is M. The procedure allows us to learn 
iteratively and adjust the prior belief by updating the values 
of the mean p and covariance matrix p of the prior, where v 
is a size of a training set for the prior belief, and a = v — 1. 

T,(a-n + l) i 

v+ 1 ’ w 

The updated mean value p* incorporates prior value of the 
mean p, and the mean value of the new training examples x. 




M x (0 
i =1 _ 

M 
vp + Mx 


E m 
i=i x 

^ M 


( 8 ) 


Figure 6 : Learning Parameters in a Neural Program 


5. EXPERIMENTAL RESULTS 

We performed our experiments on a Pioneer P3AT-SH mo¬ 
bile rover from Adept MobileRobots [l] (see Figure [7|. The 
rover uses the Carma Devkit from SECO [2l] as a main com¬ 
putational unit. The comprised Tegra 3 ARM CPU runs the 
Robot Operating System (ROS) on top of Ubuntu 12.04. 


5.1 Structure of the Parking System 

The parking system can be separated into several building 
blocks (see Figure [ 8 |. The block Rover Interface senses and 
controls the rover, that is, it establishes an interface to the 
hardware. The block Sensor Fusion takes the sensor values 
from the Rover Interface block, and provides the estimated 
pose of the rover to the high-level controller Engine. The 
Engine uses the GBN block to update the motion commands 
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Figure 7: Experimental platform: Pioneer Rover 


based on the estimated pose. Furthermore, the Engine maps 
the (higher level) motion commands to velocity commands 
needed by the Rover Interface to control the rover. 


for sampling the motion commands. Before starting the run 
we obtain the initial command vector from the distributions 
learned. The distribution of the first move Zi is independent 
from any other move and has the form A/”(/ri, cr\). Starting 
from the second move an, each motion depends on the previ¬ 
ous one: For motion number n, the distribution has the form 
JV(^„ — b nt n-i* x) l a “f led , cr^). The initial command vector 
is obtained by sampling from h, and each subsequent com¬ 
mand vector is obtained by taking into account the previous 
sample, when sampling from its own distribution. 

As the rover and its environment are uncertain (e.g., sensors 
are disturbed by noise) we use the pose provided by the 
sensor fusion unit to update the motion commands. Hence 
the motion commands are constantly adapted to take into 
account the actual driven distance (which could be different 
from the planned one due to the aforementioned uncertainty 
of the CPS). This allows us to incorporate the results of the 
sensor fusion algorithm in the updated commands. 



Figure 8: Parking system architecture 


5.1.1 The Gaussian Bayesian Network Block 
The goal of the GBN block in Figure[8j is to generate motion 
commands for the Engine to execute. A motion command 
corresponds to a driving distance or a turning angle. 

During the learning phase, the distributions of the random 
variables (RVs) in the Gaussian Bayesian network (GBN) in 
Figure [4] are collected in a CSV file of following format: 

motionType , motionDirection , mean , variance , 
dependenceCoefficient 


Parsing the CSV file initializes the GBN that will be used 


5.1.2 The Engine Block 

During the run we execute a motion command according to 
the semantics of the nwhile loop. In particular, the esti¬ 
mated pose is passed from the Sensor Fusion block to the 
Engine and compared with the target location, specified as 
a point on a 2-D plane. Since the rover is affected by noise 
its path can deviate and never come to the target location. 
To be able to detect and overcome this problem we estimate 
the scalar product of two vectors: The first one is the initial 
target location, and the second one is the current target lo¬ 
cation. This product is monotonely decreasing and becomes 
negative after passing the goal even on a deviating path. In 
an nwhile statement we monitor the distance (or angle) and 
detect if we should process the next command. 

To obtain the current state of the rover (its pose), and send 
velocity commands, we start two separate threads: (i) Re¬ 
ceive the pose and (ii) Send the velocity command. The 
motion command (containing desired driving distance or 
turning angle) is converted to a suitable velocity or steering 
command respectively, for the Rover Interface. After each 
executed command, we resample the pose in order to take 
into account actual driving distance in subsequent moves. 

5.1.3 The Rover Interface Block 

The block Rover Interface implements the drivers for sen¬ 
sors and actuators. The wheel velocities are measured by en¬ 
coders, already supplied within the Pioneer rover. A built-in 





















microcontroller reads the encoders and sends their value to 
the Carma Devkit. Additionally the rover is equipped with 
an inertial measurement unit (IMU) including an accelerom¬ 
eter, measuring the linear acceleration, and a gyroscope, 
measuring the angular velocity of the rover. The Raspberry 
Pi mounted on top of the rover samples the accelerometer 
and gyroscope, and forwards the raw IMU measurements 
to the Carma Devkit. The rover is controlled according to 
the incoming velocity commands containing desired linear 
velocity into forward direction (x-translation) and the de¬ 
sired angular velocity (z-rotation). The desired translation 
and rotation is converted to the individual wheel velocities, 
which are sent to and applied by the built-in microcontroller. 
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5.1.4 The Sensor Fusion Block 
Parking is often performed by applying predefined rules 
or following a specific trajectory [22]. So typically an au¬ 
tonomously parking car stops at a specific position beside 
a parking space and then turns and moves for a fixed time, 
angle or distance. The car has to stop, move and turn ex¬ 
actly as designated to park successfully. The car has to be 
aware of its current pose, that is, position and heading, oth¬ 
erwise parking will most likely fail (whatsoever controller is 
used). However, the current pose is observed by sensors, 
which suffer from uncertainty. Measurements are distorted 
by noise, e.g., caused by fluctuations of the elements of the 
electrical circuit of the sensors. The environment may be 
unpredictable, e.g., the car may slip over water or ice when 
parking. To overcome such problems sensor fusion tech¬ 
niques are applied, i.e., several sensors are combined to es¬ 
timate a more accurate state. A common method is state 
estimation (also called filtering) [23||30|. 


In this application, an unscented Kalman filter (UKF) [3l] is 
used. This filter combines the measurements listed in Table 
[l] with a suitable model describing the relations from the 
measured variables to the pose of the car. 



Sensor 

Variance 

Vi 

left wheel’s velocity 

0.002 m/s 

V r 

right wheel’s velocity 

0.002 m/s 

a 

linear acceleration 

0.25 m/s 2 

UJ 

angular velocity 

0.00017 rad/s 


Table 1: Used sensors and its variances. 


The belief state maintained by the UKF, e.g., the current lin¬ 
ear velocity, will be continuously updated: (i) By predicting 
the state, and (ii) By updating the prediction with measure¬ 
ments. For example, the linear velocity will be predicted by 
the current belief of acceleration and the time elapsed since 
the previous estimation. Next, the measurements from ac¬ 
celerometer and wheel encoders are used to update the pre¬ 
dicted velocity. Because the wheel encoders are much more 
accurate than the acceleration sensor (see variance in Ta¬ 
ble [TJ, the measurements from the wheel encoders will be 
trusted more (for simplicity one can think of weighting and 
averaging the measurements, where the particular weight 
corresponds to the reciprocal variance of the sensor). How¬ 
ever, by using more than one sensor, the unscented Kalman 
filter reduces the error of estimated and actual velocity. 


5.2 Integration into ROS 

ROS [26] is a meta-operating system that provides common 
functionality for robotic tasks including process communi¬ 
cation, package management, and hardware abstraction. A 
basic building block of a system in ROS is a so-called node, 
that performs computation and exchanges information with 
other entities. Nodes communicate with each other sub¬ 
scribing for or publishing messages to specific topics. So all 
the nodes subscribed to a particular topic A, will receive 
messages from nodes publishing to this topic A. 

Since the application is implemented in ROS we use the 
utility roslaunch to start the required ROS nodes (as shown 
in Figure]^ corresponding to the blocks given in Section 

Rover Interface: The node RosAria is used to control the 
velocity of the rover and provide the values of the wheel 
encoders for the sensor fusion node. The ROS nodes 
imu3000 and kxtf9 running on the Raspberry Pi pro¬ 
vide data from acceleromenter and gyroscope. 

Sensor Fusion: sf_filter node reads sensor values, im¬ 
plements the sensor fusion algorithm and provides the 
estimated pose of the rover. 

GBN and Engine: pioneer_driver is a node implement¬ 
ing resampling of commands based on the actual driven 
distance and constantly providing the required velocity 
commands to the RosAria node (see Figure |9|. 


5.1 



Figure 9: Parking system in ROS. 


5.3 Results 

After the learning phase, we obtain the parameters of the 
GBN that we use in the program skeleton (Table. [2] . Since 
we track the position using the data from the sensor fusion 
and each movement has the experimentally learned uncer¬ 
tainty, we are resistive to the perturbation of the actual 
driving distances and angles. If in the current distance of 
the robot deviates from the planned one, the commands, re¬ 
sampled from the GBN will try compensate the deviation 
with the dependencies obtained from the learning phase. 


- 


621 

0.7968 

632 

-0.2086 

&43 

0.5475 

oi 

0.0062 

02 

0.0032 


0.0019 

04 

0.022 

654 

-0.0045 

&65 

1.1920 

&76 

-0.0968 



05 

0.0008 

°6 

0.0178 

07 

0.0013 




Table 2: BN variances and coefficient dependences. 





































6. RELATED WORK 

Although probabilistic programs (PP), Gaussian Bayesian 
networks (GBN) and neural networks were considered be¬ 
fore, to the best of our knowledge, the development of smooth 
control statements, related within a GBN ontology is new. 
The ontology represents the knowledge of the PP about both 
its environment and its own control logic. 


Probabilistic programs are represented by different languages 
and frameworks [9 13 14 . The authors in [13] differentiate 
probabilistic programs from “traditional” ones, by the abil¬ 
ity to sample at random from the distribution and condition 
the values of variables via observation. Although they con¬ 
sider both discrete and continuous probability distributions, 
and transformation of Bayesian Networks and Discrete Time 
Markov Chains to probabilistic programs, they do not men¬ 
tion probabilistic control structures linked in GBN. 


In [§], the authors adapted the signal and image process¬ 
ing technique called Gaussian smoothing (GS), for program 
optimization. Using GS, a program could be approximated 
by a smooth mathematical function, which is in fact a con¬ 
volution of a denotational semantics of a program with a 
Gaussian function. This approximation is used to facilitate 
optimization for solving the parameter synthesis problem. 
In [7] this idea was developed further and soundness and ro¬ 
bustness for smooth interpretation of programs was defined. 
In both papers the authors do not consider any means for 
eliminating the re-normalization step of the probability den¬ 
sity function when a variable is passed through a conditional 
branch in the current execution trace. Moreover, they stop 
short of proposing new, smooth control statements. 


Learning Bayesian Networks comprises different tasks and 
problem formulations: i) Learning the structure of the net¬ 
work, ii) Learning the conditional probabilities for the given 
structure and in) Performing querying-inference for a given 
Bayesian Network [24]. In [TH] the authors introduce a uni¬ 
fied method for both discrete and continuous domains to 
learn the parameters of Bayesian Network, using a combi¬ 
nation of prior knowledge and statistical data. 


Various formulations of a mobile parking problem were ex¬ 
tensively studied for robots with different architectures Ell 
|19||22|[29] , For instance, in [22] the authors use a custom 
spatial configuration of the ultrasonic sensors and binaural 
method to perceive the environment and park the robot us¬ 
ing predefined rules. In 19 the authors approximate the 
trajectory for the parking task with a polynomial curve, 
that the robot could follow with the constraints satisfied, 
and used fuzzy controller to minimize the difference between 
specified trajectory and actual path. 


In order to govern a physical process (e.g., parking a car), 
the controller must be aware of the internal state of the 
process (e.g., the position of the car). Sensors measure the 
outputs of a process, whereof the state can be estimated. 
However, the measurements are distorted by noise and the 
environment may be unpredictable. State estimators [3]|23| 


used methods to increase the confidence of the state estimate 
evaluated out of raw sensor measurements. 


30 and in particular Kalman filters [30 31 are commonly 


7. CONCLUSION 

In this paper we introduced deep neural programs (DNP), 
a new formalism for writing robust and adaptive cyber- 
physical-system (CPS) controllers. Key to this formalism is: 

(i) The use of smooth Probit distributions in conditional and 
loop statements, instead of their classic stepwise counter¬ 
parts, and (2) The use of a Gaussian Bayesian network, for 
capturing the dependencies among the Probit distributions 
within the conditional and loop statements in the DPN. 

We validated the usefulness of DPNs by developing, once 
and for all, a parallel parking CPS-controller, which is able 
to adapt to unforeseen environmental situations, such as a 
slippery ground, or noisy actuators. No classic program has 
such ability: One would have to encode all this unforeseen 
situations, which would lead to an unintelligible code. 

In future work, we plan to explore the advantages of DPNs 
in the analysis, as well as, in the design (optimization) of 
CPS controllers. The nice mathematical properties of DPNs 
make them an ideal formalism for these tasks. 
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